From Text To Video: Amazon's Nova Models Take Generative AI To New Heights

Amazon's Nova family of foundation models promises groundbreaking advancements in text, image, and video AI capabilities

5 Dec 2024 10:10 AM IST

At AWS re:Invent, Amazon CEO Andy Jassy introduced the Nova series: six AI foundation models spanning text, image, and video generation, including the image model Nova Canvas and the video model Nova Reel. With a speech-to-speech model and an "Any-to-Any" multimodal model planned for 2025, Amazon is positioning itself to lead the generative AI wave.

Amazon announced a new generation of foundation models at AWS' annual showcase re:Invent, as Amazon CEO Andy Jassy took to the stage to spotlight the "explosion" of generative AI and the tech giant's bold bets. Amazon Web Services (AWS) is the cloud computing arm of Amazon.

The new set of AI foundation models is called 'Nova'. Amazon Nova Micro, Amazon Nova Lite, and Amazon Nova Pro are generally available today, while Amazon Nova Premier will be available in the Q1 2025 timeframe.

"...We have continued to work on our own frontier models and those frontier models made a tremendous amount of progress over the last 4-5 months. And we figured if we were finding value out of them, you would probably find value out of them...so I am excited to share and announce the launch of Amazon Nova which are new state-of-the-art foundation models...," Jassy said at the event.

Amazon Nova Micro is a "laser-fast" and cost-effective text-to-text model, while the multimodal models (Amazon Nova Lite, Amazon Nova Pro, and Amazon Nova Premier) can process text, images, and videos to generate text.

"We prioritise technology that we think will really matter for customers and with the explosion of generative AI over the last couple of years we have taken the same approach... there is a tonne of innovation, what we are trying to do is solve problems for you, what we think of as practical AI," he said.

Jassy, the former CEO of AWS, also announced an image-generation model, Amazon Nova Canvas, and a video-generation model, Amazon Nova Reel.

"That's six new frontier models. So what is going to be next for us in Nova? The team is going to be working really hard next year on the second generation of these models...but I also have a couple of things that I am going to give you a sneak peek into," he said, promising an Amazon Nova speech-to-speech model in the Q1 2025 timeframe and an Amazon Nova 'any-to-any' model around the middle of the year.

The Amazon Nova speech-to-speech model, scheduled for the first quarter of 2025, is designed to transform conversational AI applications by understanding streaming speech input in natural language, interpreting verbal and non-verbal cues (like tone and cadence), and delivering natural, human-like, back-and-forth interactions with low latency.

An Amazon Nova model with native multimodal-to-multimodal, or 'any-to-any', capabilities will take text, images, audio, and video as input, and generate outputs in any of these modalities. "You'll be able to input text, speech, images, or video and output text, speech, images, video...This is the future of how frontier models are going to be built and consumed, and we look forward to giving this to you," Jassy said, announcing the 'any-to-any' model that is in the works.

According to the company, Amazon Nova models have been tested against a wide range of industry-standard benchmarks. Amazon Nova Micro, Amazon Nova Lite, and Amazon Nova Pro perform quite competitively against the best models in their respective categories, it said.

Amazon Nova Canvas is an image generation model that creates professional-grade images from text or images provided in prompts. It also provides features that make it easy to edit images using text inputs, and provides controls for adjusting color scheme and layout.

"The model comes with built-in controls to support safe and responsible AI use. These include features like watermarking, which allows the source of an image to always be traced, and content moderation, which limits the generation of potentially harmful content. Amazon Nova Canvas performs better than image generators such as OpenAI DALL-E 3 and Stable Diffusion in side-by-side human evaluations conducted by a third party, and on key automated metrics," the release said.

Amazon Nova Reel is a video generation model that allows customers to easily create high-quality video from text and images, the company said, adding that it is ideal for content creation in advertising, marketing, or training.

Amazon Nova Reel currently generates six-second videos, and will support videos of up to two minutes in length in the coming months.

AWS also made a slew of announcements and provided updates on new capabilities, as AWS CEO Matt Garman highlighted how the company is delivering innovations in generative artificial intelligence (AI), including new Trainium2 instances, Trainium3 chips, and Amazon Nova foundation models. "There has never been a better time to be innovating and you have never had access to such a rich set of capable tools...," Garman said, addressing the event.
